Online Learning of k-CNF Boolean Functions
This paper revisits the problem of learning a k-CNF Boolean function from
examples in the context of online learning under the logarithmic loss. In doing
so, we give a Bayesian interpretation to one of Valiant's celebrated PAC
learning algorithms, which we then build upon to derive two efficient, online,
probabilistic, supervised learning algorithms for predicting the output of an
unknown k-CNF Boolean function. We analyze the loss of our methods, and show
that the cumulative log-loss can be upper bounded, ignoring logarithmic
factors, by a polynomial function of the size of each example.
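The Valiant algorithm the paper builds on works by clause elimination: start with every clause of at most k literals over the n variables, and delete any clause falsified by a positively labelled example. The sketch below illustrates that elimination step only; it is not the paper's online, probabilistic method, and all names in it are illustrative.

```python
from itertools import combinations

def learn_kcnf(examples, n, k):
    """Valiant-style elimination for k-CNF learning (illustrative sketch).

    Begins with every clause of at most k literals over n Boolean
    variables and removes each clause falsified by a positive example.
    The surviving conjunction is consistent with all positives seen.
    """
    literals = [(i, v) for i in range(n) for v in (True, False)]
    hypothesis = set()
    for size in range(1, k + 1):
        hypothesis.update(combinations(literals, size))
    for x, label in examples:
        if label:  # positive example: every kept clause must be satisfied
            hypothesis = {c for c in hypothesis
                          if any(x[i] == v for i, v in c)}

    def predict(x):
        # An assignment is labelled true iff it satisfies every kept clause.
        return all(any(x[i] == v for i, v in c) for c in hypothesis)

    return predict
```

Given the full truth table of a k-CNF target, the eliminated hypothesis reproduces the target exactly; with fewer examples it errs only on one side, never misclassifying a positive.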
Sparse Sequential Dirichlet Coding
This short paper describes a simple coding technique, Sparse Sequential
Dirichlet Coding, for multi-alphabet memoryless sources. It is appropriate in
situations where only a small, unknown subset of the possible alphabet symbols
can be expected to occur in any particular data sequence. We provide a
competitive analysis which shows that the performance of Sparse Sequential
Dirichlet Coding will be close to that of a Sequential Dirichlet Coder that
knows in advance the exact subset of occurring alphabet symbols. Empirically we
show that our technique can perform similarly to the more computationally
demanding Sequential Sub-Alphabet Estimator, while using less computational
resources.
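For context, a Sequential Dirichlet coder over a fixed alphabet predicts each next symbol from smoothed counts. The sketch below computes the cumulative log-loss of that standard building block (alpha = 0.5 gives the Krichevsky-Trofimov estimator); the paper's sparse variant, which adapts to an unknown occurring sub-alphabet, is not reproduced here.

```python
import math

def dirichlet_logloss(seq, alphabet_size, alpha=0.5):
    """Cumulative log-loss (bits) of a sequential Dirichlet estimator.

    After n symbols with per-symbol counts c_a, the next symbol a is
    predicted with probability (c_a + alpha) / (n + alpha * alphabet_size).
    """
    counts = {}
    n = 0
    loss = 0.0
    for a in seq:
        p = (counts.get(a, 0) + alpha) / (n + alpha * alphabet_size)
        loss -= math.log2(p)
        counts[a] = counts.get(a, 0) + 1
        n += 1
    return loss
```

Running this with a large nominal alphabet versus the true two-symbol sub-alphabet makes the gap the paper targets concrete: the full-alphabet coder pays for symbols that never occur.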
Reinforcement Learning via AIXI Approximation
This paper introduces a principled approach for the design of a scalable
general reinforcement learning agent. This approach is based on a direct
approximation of AIXI, a Bayesian optimality notion for general reinforcement
learning agents. Previously, it was unclear whether the theory of AIXI
could motivate the design of practical algorithms. We answer this hitherto open
question in the affirmative, by providing the first computationally feasible
approximation to the AIXI agent. To develop our approximation, we introduce a
Monte Carlo Tree Search algorithm along with an agent-specific extension of the
Context Tree Weighting algorithm. Empirically, we present a set of encouraging
results on a number of stochastic, unknown, and partially observable domains.
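The search component of such an agent selects actions with a UCB-style rule inside Monte Carlo Tree Search. The following is a generic sketch of that selection step only; the paper's agent couples the search with a Context Tree Weighting based environment model, which is not shown, and the function name and parameters here are illustrative.

```python
import math

def ucb_select(node_visits, arm_visits, arm_values, c=2.0):
    """UCB-style action selection as used in Monte Carlo Tree Search.

    Unvisited arms are tried first; otherwise the arm maximising the
    empirical mean value plus an exploration bonus is chosen.
    """
    best, best_score = None, -math.inf
    for a in arm_visits:
        if arm_visits[a] == 0:
            return a  # explore untried actions immediately
        score = (arm_values[a] / arm_visits[a]
                 + c * math.sqrt(math.log(node_visits) / arm_visits[a]))
        if score > best_score:
            best, best_score = a, score
    return best
```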
Approximate universal artificial intelligence and self-play learning for games
This thesis is split into two independent parts.
The first is an investigation of some practical aspects of Marcus Hutter's Universal Artificial Intelligence theory.
The main contributions are to show how a very general agent can be built and analysed using the mathematical tools of this theory.
Before the work presented in this thesis, it was an open question as to whether this theory was of any relevance to reinforcement learning practitioners.
This work suggests that it is indeed relevant and worthy of future investigation.
The second part of this thesis looks at self-play learning in two player, deterministic, adversarial turn-based games.
The main contribution is the introduction of a new technique for training the weights of a heuristic evaluation function from data collected by classical game tree search algorithms.
This method is shown to outperform previous self-play training routines based on Temporal Difference learning when applied to the game of Chess.
In particular, the highlight of this work was using the technique to construct a Chess program that learnt to play master-level Chess by tuning a set of initially random weights from self-play games.
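The weight-training idea of the second part can be illustrated, in rough outline only, as regressing a linear evaluation function toward values returned by game-tree search. The thesis's actual update rule may differ; this sketch just shows one gradient step on the squared error between the linear evaluation and a search value.

```python
def update_weights(weights, features, search_value, lr=0.01):
    """One gradient step moving a linear evaluation toward a search value.

    Minimises the squared error between w . phi(s) and the value that
    game-tree search returned from position s (illustrative sketch).
    """
    pred = sum(w * f for w, f in zip(weights, features))
    err = search_value - pred
    return [w + lr * err * f for w, f in zip(weights, features)]
```

Iterating such updates over positions visited during self-play drives the heuristic evaluation toward the deeper values the search computes.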
Context tree switching
This paper describes the Context Tree Switching technique, a modification of Context Tree
Weighting for the prediction of binary, stationary, n-Markov sources. By modifying Context
Tree Weighting’s recursive weighting scheme, it is possible to mix over a strictly larger class of
models without increasing the asymptotic time or space complexity of the original algorithm.
We prove that this generalization preserves the desirable theoretical properties of Context Tree
Weighting on stationary n-Markov sources, and show empirically that this new technique leads
to consistent improvements over Context Tree Weighting as measured on the Calgary Corpus.
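For reference, the Context Tree Weighting recursion that Context Tree Switching modifies mixes, at each node of the context tree, a memoryless Krichevsky-Trofimov estimate with the product of the node's children's probabilities. The sketch below computes that block probability for a binary sequence; the switching rule itself is not shown.

```python
def kt(a, b):
    """Krichevsky-Trofimov block probability of a zeros and b ones."""
    num = 1.0
    for i in range(a):
        num *= i + 0.5
    for j in range(b):
        num *= j + 0.5
    den = 1.0
    for n in range(a + b):
        den *= n + 1
    return num / den

def gather_counts(seq, depth):
    """Count 0s and 1s following every context of length <= depth."""
    counts = {}
    padded = "0" * depth + seq  # pad the initial context with zeros
    for t in range(depth, len(padded)):
        sym = padded[t]
        for d in range(depth + 1):
            ctx = padded[t - d:t]  # the d most recent past bits
            a, b = counts.get(ctx, (0, 0))
            counts[ctx] = (a + 1, b) if sym == "0" else (a, b + 1)
    return counts

def ctw(counts, depth, ctx=""):
    """CTW weighted block probability: mix KT with the children's product."""
    a, b = counts.get(ctx, (0, 0))
    pe = kt(a, b)
    if depth == 0:
        return pe
    p0 = ctw(counts, depth - 1, "0" + ctx)  # extend context into the past
    p1 = ctw(counts, depth - 1, "1" + ctx)
    return 0.5 * pe + 0.5 * p0 * p1
```

On an alternating sequence, the depth-1 mixture assigns far higher probability than the root KT estimator alone, since the children capture the first-order structure the memoryless model misses.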